Abstract


In this note a variant of the classical perturbation theorem for
singular values is given. The bound explains why perturbations will
tend to increase rather than decrease singular values of the same order
of magnitude as the perturbation.
On the Perturbation of
Singular Values

G.  W.  Stewart 


Dedicated  to  A.  S.  Householder  on  his 
seventy-fifth  birthday 


In this note we shall be concerned with a sharpening of the usual
perturbation bounds for the singular values of a general rectangular matrix.
Specifically, let $X$ be an $n \times p$ matrix. Then the singular values

(1)  $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p$

of $X$ may be defined as the nonnegative square roots of the eigenvalues of
$X^T X$. If $\tilde X = X + E$ is a perturbation of $X$ and the singular values
$\tilde\sigma_1, \tilde\sigma_2, \ldots, \tilde\sigma_p$ of $\tilde X$ are ordered so that

(2)  $\tilde\sigma_1 \ge \tilde\sigma_2 \ge \cdots \ge \tilde\sigma_p$,

then

(3)  $|\tilde\sigma_i - \sigma_i| \le \|E\|$  $(i = 1, 2, \ldots, p)$,

where $\|E\|$ denotes the spectral norm of $E$ (for definitions and proofs
see, e.g., [2]).
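
A small worked instance of (3) may help fix ideas. The matrices below are chosen here purely for illustration and do not appear in the original note.

```latex
X = \begin{pmatrix} 1 & 0 \\ 0 & 0.1 \\ 0 & 0 \end{pmatrix}, \qquad
E = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0.1 \end{pmatrix}, \qquad
\tilde X^T \tilde X = \begin{pmatrix} 1 & 0 \\ 0 & 0.02 \end{pmatrix},
\\[4pt]
(\sigma_1, \sigma_2) = (1,\ 0.1), \qquad
(\tilde\sigma_1, \tilde\sigma_2) = (1,\ \sqrt{0.02}) \approx (1,\ 0.1414), \qquad
\|E\| = 0.1,
\\[4pt]
|\tilde\sigma_1 - \sigma_1| = 0 \le 0.1, \qquad
|\tilde\sigma_2 - \sigma_2| \approx 0.0414 \le 0.1 .
```

The bound (3) holds for both singular values, but it is silent about the sign of the change. In this instance the small singular value grows.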

Although the perturbation bound (3) is satisfactory in most applications,
it does not give a complete description of how the singular values, especially
the small ones, actually behave under perturbations in $X$. The following
theorem provides a somewhat clearer picture.

Theorem.  Let the singular values of $X$ be ordered as in (1) and
those of $\tilde X = X + E$ as in (2).  Let $P$ denote the orthogonal projection
onto the column space of $X$.  Then

(4)  $\tilde\sigma_i^2 = (\sigma_i + \varepsilon_i)^2 + \eta_i^2$  $(i = 1, 2, \ldots, p)$,

where

(5)  $|\varepsilon_i| \le \|PE\|$

and

     $\inf[(I - P)E] \le \eta_i \le \|(I - P)E\|$.

Here

     $\inf(E) = \inf_{\|x\| = 1} \|Ex\|$.  ■
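
For $p = 1$ the decomposition (4) can be checked by hand, since $X$ is then a single column $x$, $\sigma_1 = \|x\|$, and $P$ projects onto the line through $x$. The following is a minimal sketch of that special case in plain Python. The vectors and the helper name `split_perturbation` are hypothetical choices made for illustration; they are not part of the original note.

```python
import math

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(t * t for t in v))

def split_perturbation(x, e):
    """For a one-column matrix X = x, return (eps, eta) as in (4).

    eps is the signed length of P e (the part of e along x), so |eps| = ||P e||;
    eta is the length of (I - P) e (the part of e orthogonal to x).
    """
    sigma = norm(x)
    eps = sum(a * b for a, b in zip(x, e)) / sigma
    e_perp = [b - eps * a / sigma for a, b in zip(x, e)]
    return eps, norm(e_perp)

# Illustrative data (assumed, not from the note): sigma = ||x|| = 5.
x = [3.0, 4.0, 0.0]
e = [0.3, -0.1, 0.2]

sigma = norm(x)
sigma_tilde = norm([a + b for a, b in zip(x, e)])
eps, eta = split_perturbation(x, e)

lhs = sigma_tilde ** 2                 # squared perturbed singular value
rhs = (sigma + eps) ** 2 + eta ** 2    # right-hand side of (4)
print(lhs, rhs)
```

In this one-column case $|\varepsilon_1| = \|PE\|$ and $\eta_1 = \|(I - P)E\|$ hold with equality, so (4) is just the Pythagorean theorem applied to $x + e$.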


Proof.  We use the classical min-max characterization

(6)  $\sigma_i^2 = \min_{\dim(\mathcal{X}) = p-i+1} \; \max_{x \in \mathcal{X},\, \|x\| = 1} x^T (X^T X) x$.


Let $\mathcal{X}$ be a subspace for which equality is attained in (6).  Then from the
min-max characterization of $\tilde\sigma_i^2$, we have

     $\tilde\sigma_i^2 \le \max_{x \in \mathcal{X},\, \|x\| = 1} x^T [(X + E)^T (X + E)] x$

     $= \max_{x \in \mathcal{X},\, \|x\| = 1} x^T [X^T X + E^T P X + X^T P E + E^T P^2 E + E^T (I - P)^2 E] x$,

where we have made extensive use of the usual properties of projections
in passing from the first to the second form of the bound.  Now since

     $\sigma_i^2 = \max_{x \in \mathcal{X},\, \|x\| = 1} x^T (X^T X) x = \max_{x \in \mathcal{X},\, \|x\| = 1} \|Xx\|^2$,

it follows that

(7)  $\tilde\sigma_i^2 \le \sigma_i^2 + 2 \sigma_i \|PE\| + \|PE\|^2 + \|(I - P)E\|^2$
     $= (\sigma_i + \|PE\|)^2 + \|(I - P)E\|^2$.
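
The properties of projections invoked in the upper bound can be spelled out as follows. This is a standard expansion added for the reader's convenience; it uses only $P^T = P = P^2$ and $PX = X$, and $x$ denotes any unit vector.

```latex
\begin{aligned}
E^T X &= E^T (P X) = E^T P X, \qquad X^T E = (P X)^T E = X^T P E, \\
E^T E &= E^T \bigl[ P + (I - P) \bigr] E = E^T P^2 E + E^T (I - P)^2 E, \\
x^T \bigl[ X^T X + E^T P X + X^T P E + E^T P^2 E \bigr] x
      &= \| (X + P E) x \|^2 \le \bigl( \| X x \| + \| P E \| \bigr)^2, \\
x^T E^T (I - P)^2 E x &= \| (I - P) E x \|^2 \le \| (I - P) E \|^2 .
\end{aligned}
```

Taking the maximum over unit vectors in the chosen subspace, where $\|Xx\|$ is at most $\sigma_i$, gives the bound on $\tilde\sigma_i^2$.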


For a lower bound we use the dual characterization

(8)  $\sigma_i^2 = \max_{\dim(\mathcal{X}) = i} \; \min_{x \in \mathcal{X},\, \|x\| = 1} x^T (X^T X) x$.

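Again let $\mathcal{X}$ be a subspace for which equality is attained in (8).
Proceeding as above, we get

(9)  $\tilde\sigma_i^2 \ge \min_{x \in \mathcal{X},\, \|x\| = 1} x^T [X^T X + E^T P X + X^T P E + E^T P^2 E] x + \min_{x \in \mathcal{X},\, \|x\| = 1} x^T E^T (I - P)^2 E x$

     $\ge \min_{x \in \mathcal{X},\, \|x\| = 1} x^T [X^T X + E^T P X + X^T P E + E^T P^2 E] x + \inf[(I - P)E]^2$.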


Let $x$ be a vector for which the minimum in the last expression of
(9) is attained.  Calling this minimum $\nu$, we have

     $\nu = \|Xx\|^2 + 2 (x^T X^T)(PEx) + \|PEx\|^2 \ge (\|Xx\| - \|PEx\|)^2$.

Since $\|Xx\| \ge \sigma_i$ and $\|PEx\| \le \|PE\|$, we have

(10)  $\nu \ge (\sigma_i - \|PE\|)^2$ if $\sigma_i \ge \|PE\|$,
      $\nu \ge 0$ if $\sigma_i \le \|PE\|$.

The  theorem  now  follows  on  combining  (7),  (9),  and  (10).  ■ 

We  make  three  observations  on  this  theorem.  First,  there  is  a 
trivial  variant  in  which  PE  is  replaced  by  ER,  where  R is  the 
orthogonal  projection  onto  the  row  space  of  X. 

Second, when $\sigma_i$ is reasonably larger than $\|E\|$, say $\sigma_i \ge 5\|E\|$,
the first term in the bound (4) dominates and we have

     $\tilde\sigma_i \cong \sigma_i + \varepsilon_i$,

where $\varepsilon_i$ satisfies (5).  The classical perturbation result cited at
the beginning of this note would give $|\varepsilon_i| \le \|E\|$.  Since $\|PE\| \le \|E\|$
our result is sharper; and in fact when $n \gg p$ we may expect $\|PE\|$ to
be significantly smaller than $\|E\|$, so that (5) represents a true
improvement over the classical result.


The third and perhaps most interesting observation is that when $\sigma_i$
is of order $\|E\|$, the term $\eta_i^2$ in (4) will tend to dominate.  Now $\eta_i^2$
always represents an increase in the singular value, and when $n \gg p$
this increase can be significant, depending as it does on $(I - P)E$.

To put the matter in other words, if one takes a matrix with a small
singular value and perturbs it by quantities of the same size as that
singular value, then one can expect the singular value to increase, not
decrease.  This tendency toward better conditioned matrices with larger
$\sigma_p$ has been observed in practice in connection with a regression
problem in which simulated random perturbations in the data seriously biased
the regression coefficients [1,3].
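
The tendency described above can be seen in a small simulation. The sketch below is a hedged illustration, not the experiment of [1,3]. It assumes $p = 2$, so singular values can be read off the $2 \times 2$ Gram matrix in closed form, and it takes $X$ to be diagonal with zero rows appended, so that $P$ simply keeps the first two rows. The sizes, the seed, and the helper name `singular_values_2col` are all assumptions made here.

```python
import math
import random

def singular_values_2col(rows):
    """Singular values (largest first) of a two-column matrix given as a list of rows.

    They are the square roots of the eigenvalues of the 2x2 Gram matrix
    [[a, b], [b, c]], which has a closed form.
    """
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    c = sum(r[1] * r[1] for r in rows)
    mid = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    return math.sqrt(mid + rad), math.sqrt(max(mid - rad, 0.0))

random.seed(0)                      # arbitrary seed, for repeatability only
n, delta = 200, 0.01                # n >> p = 2; noise level equals sigma_2
sigma = (1.0, 0.01)

# X = diag(sigma) padded with zero rows; its column space is span(e1, e2).
X = [[sigma[0], 0.0], [0.0, sigma[1]]] + [[0.0, 0.0] for _ in range(n - 2)]
E = [[random.gauss(0.0, delta) for _ in range(2)] for _ in range(n)]
X_tilde = [[x + e for x, e in zip(rx, re)] for rx, re in zip(X, E)]

sigma_tilde = singular_values_2col(X_tilde)
norm_E, _ = singular_values_2col(E)
norm_PE, _ = singular_values_2col(E[:2])         # P E: first two rows of E
norm_QE, inf_QE = singular_values_2col(E[2:])    # (I - P) E: remaining rows

for i in range(2):
    upper = (sigma[i] + norm_PE) ** 2 + norm_QE ** 2
    lower = max(sigma[i] - norm_PE, 0.0) ** 2 + inf_QE ** 2
    print(f"i={i + 1}: sigma={sigma[i]:.4f}  sigma_tilde={sigma_tilde[i]:.4f}  "
          f"classical bound={norm_E:.4f}  "
          f"sharpened range=[{math.sqrt(lower):.4f}, {math.sqrt(upper):.4f}]")
```

With these settings one would expect the perturbed small singular value to land well above 0.01, because the lower bound $\inf[(I - P)E]$ is already of that size or larger when $n \gg p$.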